2,042 research outputs found
Recommended from our members
Exercising the NON-VON Primary Processing Subsystem
The Primary Processing Subsystem of the NON-VON supercomputer may comprise thousands of custom nMOS integrated circuits. It is vital that faulty components be detected and located. This paper provides a collection of algorithms to exercise the Primary Processing Subsystem so that the manifestation of latent faults may be observed.
The NON-VON Supercomputer Project: Current Ideology and Three-Year Plan
While we have learned a great deal during the first two years of the NON-VON Supercomputer Project, I am reluctant to commit myself at this point to anything that might be called a "position" regarding the direction and ultimate outcome of current research in the field of parallel architectures. In part, my hesitation reflects an appreciation for the difficulty of objectively assessing the state of the field as a whole while enmeshed in the "cult of personality" surrounding a particular machine. Fortunately, our local dogma has not yet become so rigid as to preclude the possibility of significant revisions of our beliefs in response to the experiences and ideas of our colleagues. At the same time, it is clear that our understanding of the essential issues of parallel machine design in general is colored by the particular challenges we have faced in the context of the NON-VON Project. In the following discussion, I will thus try to avoid any claims regarding the ideological correctness or historical inevitability of any of the architectural principles to which I now subscribe. In their place, I will attempt to list a few of our current architectural objectives, and to outline our tentative hardware implementation plans for the next three years. Software considerations will not be discussed in this document, despite the fact that they have occupied a large fraction of our time.
Evolution of the NON-VON Supercomputer
NON-VON is a very high performance experimental supercomputer, portions of which are now being implemented at Columbia University. If efforts are successful, it should be possible to construct NON-VON machines of various sizes that could ultimately support the extremely rapid execution of a wide range of information processing tasks relevant to the defense community in a highly cost-effective manner. This study briefly sketches the most important aspects of NON-VON architecture, identifies current architectural objectives, and describes the phased hardware implementation plan which has been adopted for the next three years.
The NON-VON Supercomputer
NON-VON is a highly parallel, non-von Neumann supercomputer, portions of which are now being implemented in the Computer Science Department at Columbia University. The machine is intended to support the extremely rapid execution of large scale data manipulation tasks, including relational database operations and many other functions relevant to commercial data processing. The NON-VON architecture includes a tree-structured Primary Processing Subsystem (PPS), which we are implementing using custom nMOS VLSI circuits, along with a Secondary Processing Subsystem (SPS) based on a bank of intelligent disk drives. A high-bandwidth parallel interface provides for rapid data transfer between the two subsystems. This paper describes the organization of the NON-VON machine, with particular emphasis on the structure and function of the PPS. Some of the most important NON-VON programming techniques are then outlined, and their application to typical data processing applications illustrated with simple examples.
Knowledge-Based Retrieval on a Relational Database Machine
The central focus of this research has been the efficient retrieval of records from very large databases in applications where the criteria for description-matching require deductive inference over a domain-specific "knowledge base." Our approach has involved the design of a specialized non-von Neumann machine which permits the highly efficient evaluation of certain operators of a relational algebra of particular importance to the computational task of logical satisfaction. The architecture permits an O(log n) improvement over the best known evaluation methods for these operators on a conventional computer system, and may also offer a significant improvement over the performance of previously implemented or proposed database machines in other applications of practical import.
Developing a molecular dynamics force field for both folded and disordered protein states
Molecular dynamics (MD) simulation is a valuable tool for characterizing the structural dynamics of folded proteins and should be similarly applicable to disordered proteins and proteins with both folded and disordered regions. It has been unclear, however, whether any physical model (force field) used in MD simulations accurately describes both folded and disordered proteins. Here, we select a benchmark set of 21 systems, including folded and disordered proteins, simulate these systems with six state-of-the-art force fields, and compare the results to over 9,000 available experimental data points. We find that none of the tested force fields simultaneously provided accurate descriptions of folded proteins, of the dimensions of disordered proteins, and of the secondary structure propensities of disordered proteins. Guided by simulation results on a subset of our benchmark, however, we modified parameters of one force field, achieving excellent agreement with experiment for disordered proteins, while maintaining state-of-the-art accuracy for folded proteins. The resulting force field, a99SB-disp, should thus greatly expand the range of biological systems amenable to MD simulation. A similar approach could be taken to improve other force fields.
The Semi-Automatic Generation of Processing Element Control Paths for Highly Parallel Machines
This paper describes a recently implemented program that very rapidly generates control paths for different variants of the constituent processing elements of a particular massively parallel machine, the NON-VON Supercomputer. The program, called PLATO, accepts as input a set of instruction opcodes, together with associated control information, and produces as output a functionally correct, highly area-efficient set of PLA's for the processing elements. One novel aspect of the program is its use of a channel routing algorithm to generate a Weinberger Array layout for the OR-plane of the PLA. By supporting extremely rapid generation of processing elements with different instruction sets, PLATO facilitates "rapid turnaround" architectural experimentation of a sort that would otherwise be impractical. Use of the program has already yielded major area and performance improvements in the NON-VON processing element. Many of the techniques employed in the PLATO system should prove applicable to the semi-automatic layout of processing elements for other multiprocessor machines.
Programming the DADO Machine: An Introduction to PPL/M
DADO (Stolfo and Shaw, 1982) is a highly parallel, tree-structured machine designed to provide significant performance improvements in the execution of Artificial Intelligence software. The DADO prototype, currently being constructed at Columbia University, comprises 1023 processing elements (PE's), each consisting of an Intel 8751 microcomputer chip and an Intel 2186 8K by 8 RAM chip. The PE's are interconnected in a complete binary tree. A full version of DADO would comprise on the order of a hundred thousand PE's, each with a much smaller amount of local memory, roughly 2K bytes of RAM. (The 8K RAM employed in the DADO prototype was chosen to allow a modest amount of flexibility in designing and implementing the software base for the full version of DADO.) In addition, a specialized combinatorial I/O switch is incorporated in the full DADO design to perform the most basic communication primitives at much higher speeds than is possible with sequential logic, as it is implemented on the prototype machine.
Architecture and Applications of DADO: A Large-Scale Parallel Computer for Artificial Intelligence
As part of our research on very high performance parallel architectures, we have been investigating machine architectures specially adapted to the highly efficient implementation of artificial intelligence (AI) software. In the course of our research we designed DADO, a highly parallel, VLSI-based, tree-structured machine, and implemented a high-speed algorithm for production systems on a simulator for DADO. Subsequent research has convinced us that DADO can support many other AI applications, including the very rapid execution of PROLOG programs, and a large share of the symbolic processing typical of contemporary knowledge-based systems. In this brief report, we outline the hardware design of a moderate-size DADO prototype, comprising 1023 processing elements, which is currently under construction at Columbia University. We then sketch the software base being implemented on a small 15-processing-element prototype system, including several applications written in PPL/M, a high-level language designed for specifying parallel computations on DADO.
SIMD Tree Algorithms for Image Correlation
This paper examines the applicability of fine-grained tree-structured SIMD machines, which are amenable to highly efficient VLSI implementation, to image correlation, which is representative of image window-based operations. Several algorithms are presented for image shifting and correlation operations. A particular massively parallel machine called NON-VON is used for purposes of explication and performance evaluation. Although the most recent version of the NON-VON architecture also supports other interconnection topologies and execution modes, only its tree-structured communication capabilities and its SIMD mode of execution are considered in this paper. Novel algorithmic techniques are described, such as vertical pipelining, subproblem partitioning, associative matching, and data duplication, that effectively exploit the massive parallelism available in fine-grained SIMD tree machines while avoiding communication bottlenecks. Simulation results are presented and compared with results obtained or forecast for other highly parallel machines. The relative advantages and limitations of the class of machines under consideration are then outlined.
- …